Addressing Limited Data for Textual Entailment Across Domains
Authors
Abstract
We seek to address the lack of labeled data (and high cost of annotation) for textual entailment in some domains. To that end, we first create (for experimental purposes) an entailment dataset for the clinical domain, and a highly competitive supervised entailment system, ENT, that is effective (out of the box) on two domains. We then explore self-training and active learning strategies to address the lack of labeled data. With self-training, we successfully exploit unlabeled data to improve over ENT by 15% F-score on the newswire domain, and 13% F-score on clinical data. On the other hand, our active learning experiments demonstrate that we can match (and even beat) ENT using only 6.6% of the training data in the clinical domain, and only 5.8% of the training data in the newswire domain.
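The two strategies named in the abstract differ mainly in which unlabeled pairs they pick: self-training keeps the pairs the current model is most confident about and adds them with their predicted labels, while active learning sends the least certain pairs to a human annotator. The sketch below illustrates only that selection step. The abstract does not state the paper's actual selection criteria, so the function names, the 0.9 confidence threshold, and the use of distance from 0.5 as an uncertainty measure are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def select_for_self_training(
    probs: Sequence[float], threshold: float = 0.9
) -> List[Tuple[int, int]]:
    """Pick unlabeled pairs the model is confident about (hypothetical helper).

    `probs[i]` is the model's predicted probability that pair i is an
    entailment. Returns (index, pseudo_label) tuples, where the label is
    1 (entailment) or 0 (no entailment). The threshold is an assumed value.
    """
    picked = []
    for i, p in enumerate(probs):
        if p >= threshold:
            picked.append((i, 1))
        elif p <= 1.0 - threshold:
            picked.append((i, 0))
    return picked


def select_for_annotation(probs: Sequence[float], budget: int) -> List[int]:
    """Pick the `budget` most uncertain pairs for human labeling (hypothetical helper).

    Uncertainty is measured here as closeness of the probability to 0.5,
    which is one common choice and not necessarily the paper's.
    """
    ranked = sorted(range(len(probs)), key=lambda i: abs(probs[i] - 0.5))
    return ranked[:budget]


# Illustrative probabilities for five unlabeled premise/hypothesis pairs.
example_probs = [0.97, 0.52, 0.08, 0.45, 0.70]
pseudo_labeled = select_for_self_training(example_probs, threshold=0.9)
to_annotate = select_for_annotation(example_probs, budget=2)
```

In a full loop, either selection would be followed by retraining the entailment classifier on the enlarged labeled set and re-scoring the remaining unlabeled pairs.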
Similar resources
Acquiring entailment pairs across languages and domains: A Data Analysis
Entailment pairs are sentence pairs of a premise and a hypothesis, where the premise textually entails the hypothesis. Such sentence pairs are important for the development of Textual Entailment systems. In this paper, we take a closer look at a prominent strategy for their automatic acquisition from newspaper corpora, pairing first sentences of articles with their titles. We propose a simple l...
Textual Entailment as an Evaluation Framework for Metaphor Resolution: A Proposal
We aim to address two complementary deficiencies in Natural Language Processing (NLP) research: (i) Despite the importance and prevalence of metaphor across many discourse genres, and metaphor’s many functions, applied NLP has mostly not addressed metaphor understanding. But, conversely, (ii) difficult issues in metaphor understanding have hindered large-scale application, extensive empirical e...
Combining Specialized Entailment Engines for RTE-4
The main goal of FBK-irst participation at RTE-4 was to experiment with combining specialized entailment engines, each addressing a specific phenomenon relevant to entailment. The approach is motivated by the fact that textual entailment arises from a combination of several linguistic phenomena that interact with one another in quite complex ways. We were driven by the following two considerations: (i) de...
Recognizing Textual Entailment Using a Subsequence Kernel Method
We present a novel approach to recognizing Textual Entailment. Structural features are constructed from abstract tree descriptions, which are automatically extracted from syntactic dependency trees. These features are then applied in a subsequence-kernel-based classifier to learn whether an entailment relation holds between two texts. Our method makes use of machine learning techniques using a ...
Normalized alignment of dependency trees for detecting textual entailment
In this paper, we investigate the usefulness of normalized alignment of dependency trees for entailment prediction. Overall, our approach yields an accuracy of 60% on the RTE2 test set, which is a significant improvement over the baseline. Results vary substantially across the different subsets, with a peak performance on the summarization data. We conclude that normalized alignment is useful f...
Journal title:
- CoRR
Volume abs/1606.02638 Issue
Pages -
Publication date 2016